Greykite: Deploying Flexible Forecasting at Scale at LinkedIn
Forecasts help businesses allocate resources and achieve objectives. At
LinkedIn, product owners use forecasts to set business targets, track outlook,
and monitor health. Engineers use forecasts to efficiently provision hardware.
Developing a forecasting solution to meet these needs requires accurate and
interpretable forecasts on diverse time series with sub-hourly to quarterly
frequencies. We present Greykite, an open-source Python library for forecasting
that has been deployed on over twenty use cases at LinkedIn. Its flagship
algorithm, Silverkite, provides interpretable, fast, and highly flexible
univariate forecasts that capture effects such as time-varying growth and
seasonality, autocorrelation, holidays, and regressors. The library enables
self-serve accuracy and trust by facilitating data exploration, model
configuration, execution, and interpretation. Our benchmark results show
excellent out-of-the-box speed and accuracy on datasets from a variety of
domains. Over the past two years, Greykite forecasts have been trusted by
Finance, Engineering, and Product teams for resource planning and allocation,
target setting and progress tracking, anomaly detection and root cause
analysis. We expect Greykite to be useful to forecast practitioners with
similar applications who need accurate, interpretable forecasts that capture
complex dynamics common to time series related to human activity.
Comment: In Proceedings of the 28th ACM SIGKDD Conference on Knowledge
Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA.
ACM, New York, NY, USA, 11 pages
Statistical Aspects of High-Dimensional Sparse Artificial Neural Network Models
An artificial neural network (ANN) is an automatic way of capturing linear and nonlinear correlations, as well as spatial and other structural dependence, among features. ANNs perform well in many application areas, such as classification and prediction from magnetic resonance imaging, spatial data, and computer vision tasks. Most commonly used ANNs assume that the training sample is large relative to the dimension of the feature vector. However, in modern applications such as those mentioned above, training sample sizes are often small, and may even be smaller than the dimension of the feature vector. In this paper, we consider a single-layer ANN classification model that is suitable for analyzing high-dimensional low sample-size (HDLSS) data. We investigate the theoretical properties of the sparse group lasso regularized neural network and show that, under mild conditions, the classification risk converges to the optimal Bayes classifier's risk (universal consistency). Moreover, we propose a variation on the regularization term. A few examples from popular research fields are also provided to illustrate the theory and methods.
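For readers unfamiliar with the regularizer named in the abstract above, the following is a minimal sketch of the standard sparse group lasso penalty applied to the first-layer weights of a single-layer network. It assumes one group per input feature (all weights leaving that feature), which is a common choice for feature selection but is not confirmed by the abstract. The function name and the parameters `lam` and `alpha` are hypothetical, and the variation on the penalty that the paper proposes is not shown.

```python
import math


def sparse_group_lasso_penalty(weights, lam=0.1, alpha=0.5):
    """Standard sparse group lasso penalty (illustrative, not the paper's code).

    weights[j] holds the first-layer weights leaving input feature j, so each
    input feature forms one group (an assumption made for this sketch).
    lam scales the whole penalty; alpha in [0, 1] mixes the two terms.
    """
    # Lasso term: sum of absolute values, encourages individual zero weights.
    l1_term = sum(abs(w) for group in weights for w in group)
    # Group lasso term: Euclidean norm per group, scaled by sqrt(group size),
    # encourages whole features to drop out of the model.
    group_term = sum(
        math.sqrt(len(group)) * math.sqrt(sum(w * w for w in group))
        for group in weights
    )
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)
```

Setting `alpha=1.0` recovers the plain lasso penalty and `alpha=0.0` the plain group lasso penalty. In training, this value would be added to the classification loss.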
The Impact Of Global Unknown Teleconnection Patterns On Terrestrial Precipitation Across North And Central America
Global sea surface temperature (SST) anomalies can affect terrestrial precipitation via ocean-atmosphere interactions known as climate teleconnections. Nonstationary and nonlinear characteristics of the teleconnection signals passing through the complex ocean-atmosphere-land system may provide a unique opportunity to quantify large-scale climate variability. This work explores the systematic relationships between global SST anomalies and terrestrial precipitation variability with respect to long-term nonlinear and nonstationary teleconnection signals during 1981–2010 over three regions in North America and one in Central America. The aim of this study was to investigate the surveillance capacity of teleconnections through varying atmospheric pathways toward different types of landscape and geographical environments. After finding possible associations between the dominant variation of seasonal precipitation and global SST anomalies through integrated empirical mode decomposition, wavelet analysis, and lagged correlation analysis, the statistically significant SST regions were extracted to identify both known and unknown teleconnections. Results indicate that previously unidentified SST regions contribute a salient portion of terrestrial precipitation variability over different terrestrial regions. The Central America and Pacific Northwest study sites receive the highest probable impacts of climate variability driven by some unknown teleconnections that reveal unique coupling interactions between oceanic and atmospheric processes, implying possible linkages with atmospheric rivers.
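As an illustration of the lagged correlation step mentioned in the abstract above, the sketch below correlates a driver series (for example, an SST anomaly index) with a response series (for example, a precipitation anomaly) at several non-negative lags and reports the lag with the strongest absolute correlation. It is a generic Pearson-based sketch under the assumption of equally spaced, equal-length series; the function names are hypothetical, and the study's actual procedure, including the empirical mode decomposition, wavelet analysis, and significance testing, is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0  # treat a constant series as uncorrelated
    return cov / (sd_x * sd_y)


def lagged_correlation(driver, response, lag):
    """Correlate driver[t] with response[t + lag] for a lag >= 0."""
    if len(driver) != len(response):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(driver) - 1:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson(driver[: len(driver) - lag], response[lag:])


def best_lag(driver, response, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(
        range(max_lag + 1),
        key=lambda k: abs(lagged_correlation(driver, response, k)),
    )
```

A response that simply repeats the driver two steps later yields a best lag of 2. With real climate series, each lag's correlation would also need a significance test before any region is treated as a teleconnection candidate.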